Accelerating scientific applications with OpenMP offload on SDSC Cosmos (Advanced Computing Series)

Remote event

Most of the performance improvements of the past decade have come from GPU compute, so most compute-heavy codes have a strong incentive to transition from CPU-only to GPU-accelerated execution. Luckily, many parallel programming concepts that apply to CPU-only compute can also be exploited on GPUs, and OpenMP offload has emerged as one of the easiest paradigms for accelerating existing code bases. Unfortunately, not everything that can be done in CPU-only OpenMP can be done on the GPU; in this talk we will explore what those limitations are and how to work around them. There are also performance considerations to keep in mind, a major one being the partitioning of memory between CPU and GPU cores. We will explore those considerations both in traditional discrete-GPU systems and in more advanced systems, like the AMD MI300A-based SDSC Cosmos.

Instructor

Igor Sfiligoi

Senior Research Scientist - Distributed High-Throughput Computing, SDSC

Igor Sfiligoi has a long history of helping scientific groups optimize and accelerate their codes on various architectures. He has been working with datacenter-grade GPU systems for almost a decade and has experienced first-hand the pains of GPU acceleration during that period. He has also seen the rise and fall of various accelerator ecosystems and has a deep appreciation of how dynamic this environment has been. He is also one of the co-PIs of the SDSC Cosmos HPC system.